(57) ABSTRACT

A multilingual Domain Name System allows users to use
Domain Names in non-Unicode or non-ASCII encodings. An
international DNS server (or iDNS server) receives multi- 
lingual DNS requests and converts them to a format that can 
be used in the conventional Domain Name System. When 
the iDNS server first receives a DNS request, it determines 
the encoding type of that request. It may do this by consid- 
ering the bit string in the top-level domain (or other portion) 
of the Domain Name and matching that string against a list 
of known bit strings for known top-level domains of various 
encoding types. One entry in the list may be the bit string for 
".com" in Chinese BIG5, for example. After the iDNS server 
identifies the encoding type of the Domain Name, it converts 
the encoding of the Domain Name to Unicode. It then 
translates the Unicode representation to an ASCII represen- 
tation conforming to the universal DNS standard. This is 
then passed into a conventional Domain Name System, 
which recognizes the ASCII format Domain Name and 
returns the associated IP address. 
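
The detection-and-conversion flow summarized above can be sketched in Python. This is a minimal illustration only, not the patented implementation: the table contents, the hypothetical localized top-level domains (公司, 会社) and the function names are assumptions introduced here. Python's built-in `idna` codec is used merely as a stand-in for the Unicode-to-ASCII step, which the abstract does not specify.

```python
# Hypothetical mapping table: known top-level-domain byte strings -> encoding.
# The localized TLDs are illustrative assumptions, not real registrations.
KNOWN_TLDS = {
    "公司".encode("big5"): "big5",
    "公司".encode("gb2312"): "gb2312",
    "会社".encode("shift_jis"): "shift_jis",
    b"com": "ascii",
}


def detect_encoding(raw_name: bytes) -> str:
    """Match the bytes of the top-level domain against the known table."""
    tld = raw_name.rstrip(b".").rpartition(b".")[2]
    try:
        return KNOWN_TLDS[tld]
    except KeyError:
        raise ValueError("unrecognized top-level domain bytes") from None


def to_dns_name(raw_name: bytes) -> bytes:
    """Detect the encoding, convert to Unicode, then to an ASCII form."""
    unicode_name = raw_name.decode(detect_encoding(raw_name))
    # Stand-in ASCII step: Python's IDNA codec, not the patent's own scheme.
    return unicode_name.encode("idna")
```

Under these assumptions, the same name submitted in BIG5 and in GB2312 yields one identical ASCII name, which is the property the abstract relies on before handing the request to a conventional DNS server.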

16 Claims, 8 Drawing Sheets 
MULTI-LANGUAGE DOMAIN NAME
SERVICE 

BACKGROUND OF THE INVENTION 

The present invention relates to the Domain Name Ser- 
vice used to resolve network domain names into correspond- 
ing network addresses. More particularly, the invention 
relates to an alternative or modified Domain Name Service 
that accepts domain names provided in many different 
encoding formats, not just ASCII. 

The Internet has evolved from a purely research and 
academic entity to a global network that reaches a diverse 
community with different languages and cultures. In all 
areas the Internet has progressed to address the localization 
needs of its audience. Today, electronic mail is exchanged in 
most languages. Content on the World Wide Web is now 
published in many different languages as multilingual- 
enabled software applications proliferate. It is possible to 
send an e-mail message to another person in Chinese or to 
view a World Wide Web page in Japanese. 

The Internet today relies entirely on the Domain Name 
System to resolve human readable names to numeric IP 
addresses and vice versa. The Domain Name System (DNS) 
is still based on a subset of the Latin-1 alphabet and is thus still
mainly English. To provide universality, e-mail addresses, Web
addresses, and other Internet addressing formats adopt 
ASCII as the global standard to guarantee interoperation. No 
provision is made to allow for e-mail or Web addresses to be 
in a non-ASCII native language. The implication is that any 
user of the Internet has to have some basic knowledge of 
ASCII characters. 

While this does not pose a problem to technical or 
business users who, generally speaking, are able to under- 
stand English as an international language of science, 
technology, business and politics, it is a stumbling block to 
the rapid proliferation of the Internet to countries where 
English is not widely spoken. In those countries, the Internet 
neophyte must understand basic English as a prerequisite to 
send e-mail in her own native language because the e-mail 
address cannot support the native language even though the 
e-mail application can. Corporate intranets have to use 
ASCII to name their department domain names and Web 
documents simply because the protocols do not support 
anything other than ASCII in the domain name field even though
filenames and directory paths can be multilingual in the 
native locale. 

Moreover, users of European languages have to approximate
their domain names without accents and so on. A
company like Citroën wishing to have a corporate identity
has to approximate itself to the closest ASCII equivalent and
use "www.citroen.fr", and Mr. François from France has to
constantly bear the irritation of deliberately mis-typing his
e-mail address as "francois@email.fr" (as a fictitious
example).

Currently, user-ids in an e-mail address field can be in 
multilingual scripts as operating systems can be localized to 
provide fonts in the relevant locale. Directories and filenames
can also be rendered in multilingual scripts.
However, the domain name portion of these names is
restricted to the characters permitted by the Internet standard in
RFC1035, the standard setting forth the Domain Name 
System. 

One justifiable reason for this situation could be that
software developers tended to use overlapping codes. For
example, the Chinese BIG5 and GB2312 encodings (i.e.,
digital representations of glyphs or characters) overlap, as
do the Japanese JIS and Shift-JIS and the Korean KSC5601,
just to name a few. As a result, one cannot easily tell BIG5
from JIS, or GB2312 from KSC5601, unless an additional
parameter specifying the encoding is included to inform the
application client which encoding is being used. Therefore,
to ensure uniqueness of domain names and certainty of
encoding, DNS has stuck to ASCII.
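
The overlap described above can be made concrete with a short Python sketch. It is an illustration only; the sample word and variable names are chosen here and do not come from the original text. The same four bytes decode without error under both BIG5 and GB2312, yet yield different characters, so the bytes alone do not reveal the encoding.

```python
# "中文" in BIG5 is the byte sequence A4 A4 A4 E5.
raw = "中文".encode("big5")

as_big5 = raw.decode("big5")      # the intended traditional Chinese text
as_gb2312 = raw.decode("gb2312")  # the identical bytes read as GB2312

# Both decodings succeed, but they produce different characters.
print(raw.hex(), repr(as_big5), repr(as_gb2312))
```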

Based on RFC1035, valid domain names are currently
restricted to a subset of the ISO-8859 Latin 1 alphabet,
which comprises the alphabet letters A-Z (case insensitive),
numbers 0-9 and the hyphen (-) only. This
restriction effectively limits a domain name to English
or to languages with a romanized form, such as Malay or
Romaji in Japanese, or a roman transliteration, such as
transliterated Tamil. No other script is acceptable; even the
extended ASCII characters cannot be used.
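
As a rough illustration of this restriction, the following Python sketch checks whether a name uses only letters, digits and hyphens. The function names are hypothetical; the 63-character label limit is taken from RFC 1035 itself rather than from the paragraph above, and the other syntax rules of the RFC are deliberately not modeled.

```python
import re

# One label: letters, digits and hyphen only, at most 63 characters.
_LABEL = re.compile(r"[A-Za-z0-9-]{1,63}")


def is_ldh_name(name: str) -> bool:
    """Return True if every dot-separated label uses only A-Z, 0-9 and '-'."""
    labels = name.rstrip(".").split(".")
    return all(_LABEL.fullmatch(label) is not None for label in labels)


def same_name(a: str, b: str) -> bool:
    """Compare two names case-insensitively, as DNS does for ASCII letters."""
    return a.lower() == b.lower()
```

With this check, `www.citroen.fr` passes while a name containing an accented letter or a Chinese character is rejected, which is the limitation the remainder of the description sets out to remove.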

Unicode is a character encoding system in which nearly
every character of most important languages is uniquely
mapped to a 16-bit value. Since Unicode has laid down the
foundations for a unique, non-overlapping encoding system,
some researchers have begun to explore how Unicode can be
used as the basis for a future DNS namespace, which can
embrace the rich diversity of languages present in the world
today. See M. Dürst, "Internationalization of Domain
Names," Internet Draft "draft-duerst-dns-i18n-02.txt,"
which can be found at the IETF home page,
http://www.ietf.cnri.reston.va.us/ID.html, July 1998. This
document is incorporated herein by reference in its entirety and
for all purposes. The new namespace should be able to offer
multilingual and multiscript functionality that will make it
easier for non-English speakers to use the Internet.

Adopting Unicode as the standard character set for a new
Domain Name System avoids overlapping code space for
different language scripts. In this way, it may allow the
Internet community to use domain names in their native
scripts such as:
www.citroën.ch

www.genève-city.ch

Unfortunately, several difficulties would preclude modifying
the DNS server and client applications to implement a
multilingual Domain Name System. For example, all future
client applications and all future DNS servers would have to be
modified. As both client and server have to be modified for
the system to work, the transition from the old system to the
new system could be difficult. Further, very few available
client applications use native Unicode. Instead, most
multilingual client applications use non-Unicode encodings, and
these have strong followings.

In view of these and other issues, it would be highly 
desirable to have a technique allowing the many linguistic 
encodings to be used in the DNS system. 

SUMMARY OF THE INVENTION

The present invention provides systems and methods for
implementing a multilingual Domain Name System allowing
users to use Domain Names in non-Unicode and
non-ASCII encodings. While the method may be implemented in
various systems or combinations of systems, for now the
implementing system will be referred to as an international
DNS server (or "iDNS" server). When the iDNS server first
receives a DNS request, it determines the encoding type of
that request. It may do this by considering the bit string in
the top-level domain of the Domain Name and matching that
string against a list of known bit strings for known top-level
domains of various encoding types. One entry in the list may
be the bit string for ".com" in Chinese BIG5, for example.
After the iDNS server identifies the encoding type of the
Domain Name, it converts the encoding of the Domain
Name to a universal linguistic encoding type (e.g., Unicode).
It then translates the universal linguistic encoding type
representation to an ASCII representation conforming to the
universal DNS standard. This is then passed into a conventional
Domain Name System, which recognizes the ASCII
format Domain Name and returns the associated IP address.

One aspect of the invention provides a method of detecting
the linguistic encoding type of a digitally represented
domain name. The method may be characterized by the
following sequence: (a) receiving the digital sequence of a
prespecified portion (e.g., a top-level domain) of the digitally
represented domain name; (b) matching the digital
sequence from the domain name with a known digital
sequence from a collection of known digital sequences; and
(c) identifying an encoding type associated with the known
digital sequence matching the digital sequence from the
domain name. Each of the known digital sequences used in
(b) is associated with a particular linguistic encoding type.
Note that the collection of known digital sequences includes
known digital sequences for at least two different linguistic
encoding types.

It will often be convenient to provide the collection in a
table containing records having attributes including known
digital sequences and encoding types. In this case, identifying
the encoding type requires identifying the encoding
type of a record having the matching known digital
sequence. Examples of encoding types represented in the
table include ASCII, BIG5, GB2312, Shift-JIS, EUC-JP,
KSC5601, and extended ASCII.

When at least two known digital sequences match the
digital sequence from the domain name, it will be necessary
to resolve the ambiguity. This may be accomplished by (a)
receiving the digital sequence of a second portion of the
digitally represented domain name; (b) decoding the digital
sequence of the second portion multiple times, each time
using a decoding scheme of a different one of the linguistic
encoding types, each associated with the at least two known
digital sequences; and (c) identifying the decoding that gives
the best result. Alternatively, the ambiguity may be resolved
by first matching an extended digital sequence (including
both the first and second portions of the domain name) and
then matching that extended sequence against known digital
sequences that may correspond to the extended sequence. In
this case, the collection of known digital sequences must
include some of the extended sequences.

In a specific embodiment, the collection of records
includes a digital sequence (or representation of a digital
sequence) of a "minimum code resolving string" (MCRS).
This is a digital sequence for a portion of a domain name and
is known to distinguish that domain name, in a particular
encoding type, from every other domain name/encoding
type combination in the collection. The MCRS may be a
sub-string of the top-level domain, a super-string of the
top-level domain, overflow to the second and third level
domains, etc., so long as ambiguity is avoided when matching
takes place.

As mentioned, the method is particularly applicable to
handling DNS requests. Thus, the method may also involve
(i) receiving a DNS request containing the digitally represented
domain name; (ii) identifying a root level DNS server
responsible for resolving root level domains of the identified
encoding type; and (iii) transmitting the DNS request to the
root level DNS server. Prior to transmitting the DNS request,
the system should convert the domain name's digital
sequence from the identified encoding type to a DNS
encoding type compatible with DNS protocol (e.g., ASCII or
possibly Unicode or some other universal encoding in the
future). In a preferred embodiment, this conversion takes
place in two operations: (i) converting the domain name's
digital sequence from the identified encoding type to a
universal linguistic encoding type; and (ii) converting the
domain name's digital sequence from the universal linguistic
encoding type to a DNS encoding type compatible with
the DNS protocol.

The invention also provides a mapping table that associates
particular linguistic encoding types with particular
digital sequences. The mapping table includes a plurality of
records, each including the following attributes: (a) a known
digital sequence of a prespecified portion of a digitally
represented domain name; and (b) a linguistic encoding type
associated with the known digital sequence. The prespecified
portion of the digitally represented domain name may
be the digital sequence of the root level domain in the
domain name. The records may also include a top-level
DNS server responsible for resolving top-level
domains of the linguistic encoding type in the record. Still
further, the mapping table may specify the type of transformation
required to convert domain names from a non-DNS
type to a DNS compliant encoding type.

This invention also relates to an apparatus that may be
characterized by the following features: (a) one or more
processors; (b) memory coupled to at least one of the one or
more processors; and (c) one or more network interfaces
capable of receiving a DNS request including a domain
name in a non-DNS encoding type and transmitting a DNS
request with the domain name in a DNS encoding type that
is compatible with the DNS protocol. At least one of the one
or more processors will be designed or configured to convert
the domain name in the non-DNS encoding type to that
domain name in the DNS encoding type. The one or more
network interfaces should be coupled to a network in a
manner allowing the apparatus to receive client DNS
requests presenting the domain name in the non-DNS encoding
type. Further, the one or more network interfaces should
be coupled to the network in a manner allowing the apparatus
to transmit a DNS request to a standard DNS server,
with the DNS request presenting the domain name in the
DNS encoding type.

The apparatus preferably includes a mapping table
(possibly like one of those described above) residing, at least
in part, on the memory. Further, at least one processor should
be configured or designed to identify the non-DNS encoding
type of the domain name prior to converting that domain
name from the non-DNS encoding type to the DNS encoding
type.

These and other features and advantages of the present
invention will be described in more detail below with
reference to the drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic illustration of a network architecture
including an iDNS server positioned between a DNS server
and a client.

FIG. 2 is a process flow diagram depicting the resolution
of a DNS request presenting a Domain Name in a non-DNS
encoding type, in accordance with one embodiment of the
present invention.

FIG. 3A is a process flow diagram depicting a process for
converting a Domain Name in a non-DNS encoding type to
a corresponding Domain Name in a DNS encoding type.
FIG. 3B is an illustration of the logical components of an
iDNS system.

FIG. 4 is a process flow diagram depicting a process for
determining the encoding type of a Domain Name.

FIG. 5 is an illustration of a logical mapping table used to
identify encoding types of domain names in accordance with
one embodiment of this invention.

FIG. 6 is a "tree" diagram depicting a hierarchy of
Chinese language encodings.

FIG. 7 is a block diagram of a general-purpose computer
system that may be employed to implement iDNS functions
of the present invention.

DETAILED DESCRIPTION OF THE
PREFERRED EMBODIMENTS

1. DNS and Unicode

The present invention transforms multilingual multiscript
names to a form that is compliant with DNS (e.g., DNS as
explained in RFC1035 as of 1999). These transformed
names may then be relayed as DNS queries to a conventional
DNS server. An exemplary process of how a localized
domain name is resolved to its numeric IP address is
illustrated by FIG. 1 below. However, before FIG. 1 is
described, a few underlying principles and terms will be
discussed.

Programs rarely refer to hosts and other resources by
their binary network addresses. Instead of binary numbers,
they use ASCII strings, such as www.pobox.org.sg.
Nevertheless, the network itself only understands binary
addresses, so some mechanism is required to convert the
ASCII strings to network addresses. This mechanism is
provided by the Domain Name System.

The essence of DNS is a hierarchical, domain-based
naming scheme and a distributed database system for
implementing this naming scheme. It is primarily used for
mapping host names and e-mail destinations to IP addresses, but
can be used for other purposes. As mentioned, DNS is
defined in RFCs 1034 and 1035.

Very briefly, the way DNS is used is as follows. To map
a name onto an IP address, an application program calls a
library procedure called the "resolver," passing it the name
as a parameter. The resolver sends a UDP packet to a local
DNS server, which then looks up the name and returns the
IP address to the resolver, which then returns it to the caller.
With the IP address in hand, the program can establish a TCP
connection with the destination or send it UDP packets.

Conceptually, the Internet is divided into many top-level
"domains," where each domain covers many hosts. Each
domain is partitioned into sub-domains and these are further
partitioned, and so on. All these domains can be represented
by a tree. The leaves of the tree represent domains that have
no sub-domains (but do contain machines, of course). A leaf
domain may contain a single host, or it may represent a
company that contains thousands of hosts.

The top-level domains come in two flavors: generic and
countries. The generic domains are com (commercial), edu
(educational institutions), gov (the United States federal
government), int (certain international organizations), mil
(the United States armed forces), net (network providers),
and org (organizations). The country domains include one
entry for every country, as defined in ISO 3166. Each domain
is named by the path upward from it to the unnamed root.
The components are separated by periods (pronounced
"dot").

In principle, domains can be inserted into the tree in two
different ways. For example, cs.ucb.edu could equally well
be listed under the us country domain as cs.ucb.ct.us. In
practice, however, nearly all organizations in the United
States are under a generic domain, and nearly all outside the
United States are under the domain of their country. There
is no rule against registering under two top-level domains,
but doing so could be confusing, so few organizations do it.

Each domain controls how it allocates the domains under
it. For example, Japan has domains ac.jp and co.jp that
mirror edu and com. To create a new domain, permission is
required of the domain in which it will be included. For
example, if an artificial intelligence group is started at the
University of California at Berkeley and wants to be known
as ai.cs.ucb.edu, it needs permission from whomever manages
cs.ucb.edu. Similarly, if a new university is chartered,
say, the University of [illegible], it must ask the manager
of the edu domain to assign it ulth.edu. In this way, name
conflicts are avoided and each domain can keep track of all
its sub-domains. Once a new domain has been created and
registered, it can create its own sub-domain, such as
cs.ulth.edu, without getting permission from any entity
higher up in the tree.

In theory, at least, a single name server could contain the
entire DNS database and respond to all queries about it. In
practice, this server would be so overloaded as to be useless.
Furthermore, if it ever went down, the entire Internet would
be crippled. To avoid the problems associated with having
only a single source of information, the DNS name space is
divided into non-overlapping "zones." Each zone contains
some part of the tree and also contains name servers holding
the authoritative information about that zone. Normally, a
zone will have one primary name server, which gets its
information from a file on its disk, and one or more
secondary name servers, which get their information from
the primary name server.

When a resolver gets a query about a domain name, it
passes the query to one of the local name servers. If the
domain being sought falls under the jurisdiction of the name
server, such as ai.cs.ucb.edu falling under cs.ucb.edu, it
returns the authoritative resource records. An authoritative
record is one that comes from the authority that manages the
record, and is thus always correct. A given name server may
also contain "cached records," which may be out of date.

If the domain of interest is remote and no information
about the requested domain is available locally, the name
server sends a query message to the top-level name server
for the domain requested. For example, a local name server
seeking to find the IP address for ai.cs.ucb.edu may send a
UDP packet to the server for edu given in its database,
eduserver.net. It is unlikely that this server knows the
address of ai.cs.ucb.edu, and it probably does not know
cs.ucb.edu either, but it must know all of its own children, so
it forwards the request to the name server for ucb.edu. In
turn, this one forwards the request to cs.ucb.edu, which must
have the authoritative resource records. Since each request
is from a client to a server, the authoritative record requested
works its way back to the original name server requesting
the IP address for ai.cs.ucb.edu.

Once the record gets back to the original name server, it
will be entered into a cache there, in case it is needed later.
However, this information is not authoritative, since changes
made at cs.ucb.edu will not be propagated to all the caches
in the world that may know about it. For this reason, a cache
entry should be removed or updated frequently. This may be
accomplished with a "time_to_live" field included in each
record.

The above example of a method for resolving a domain
name is referred to as recursive querying. Other techniques
exist. For more detail on DNS, see Andrew S. Tanenbaum,
"Computer Networks," 3rd Ed., Prentice Hall, Upper Saddle
River, N.J. (1996) from which much of the above discussion
was adapted. See also U. D. Black, "TCP/IP and Related
Protocols," 3rd Ed., McGraw-Hill, San Francisco, Calif.
(1998). Both of these references are incorporated herein by
reference for all purposes.
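
The walk down the domain tree and the time-to-live caching described above can be sketched as a small in-memory model. This is a toy illustration under stated assumptions, not a DNS implementation: no packets are sent, the class and function names are hypothetical, and the address 192.0.2.10 comes from a range reserved for documentation.

```python
import time


def zone_chain(name):
    """List the zones asked in turn, from the top-level domain downward."""
    labels = name.rstrip(".").split(".")
    return [".".join(labels[i:]) for i in range(len(labels) - 1, 0, -1)]


class CachingResolver:
    """Toy local name server: walks the zone chain and caches answers."""

    def __init__(self, records, ttl=60.0, clock=time.monotonic):
        self.records = records  # zone -> {name: address}, hypothetical data
        self.ttl = ttl          # time-to-live applied to cached answers
        self.clock = clock
        self.cache = {}         # name -> (address, expiry time)
        self.queries = 0        # number of simulated upstream queries

    def resolve(self, name):
        """Return (address, authoritative) for the given name."""
        hit = self.cache.get(name)
        if hit is not None and hit[1] > self.clock():
            return hit[0], False  # cached, hence not authoritative
        for zone in zone_chain(name):
            self.queries += 1
            address = self.records.get(zone, {}).get(name)
            if address is not None:
                self.cache[name] = (address, self.clock() + self.ttl)
                return address, True
        raise KeyError(name)
```

For ai.cs.ucb.edu the chain is edu, ucb.edu, cs.ucb.edu, mirroring the forwarding order in the text. A repeated lookup within the time-to-live is answered from the cache and flagged as non-authoritative; after expiry the walk is repeated.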

As noted, the DNS protocol is currently based upon a 
subset of ASCII, and is thus limited to the Latin alphabet. 
Numerous other encodings provide digital representations
for other character sets of the world. Examples include BIG5
and GB-2312 for Chinese character scripts (traditional and 
simplified respectively), Shift-JIS and EUC-JP for Japanese 
character scripts, KSC-5601 for Korean character scripts, 
and the extended ASCII characters for French and German 
characters, for instance. 

Beyond these language-specific encoding types, there 
exists the Unicode standard (a "universal linguistic encoding 
type") that provides the capacity to encode all the characters 
used in the written languages of the world. It uses a 16-bit
encoding that provides code points for more than 65,000
characters. Unicode scripts include Latin, Greek, Cyrillic,
Armenian, Hebrew, Arabic, Devanagari, Bengali, Gurmukhi,
Gujarati, Oriya, Tamil, Telugu, Kannada, Malayalam, Thai,
Lao, Georgian, Tibetan, Japanese Kana, the complete set of
modern Korean Hangul, and a unified set of Chinese/
Japanese/Korean (CJK) ideographs. Many more scripts and
characters are to be added shortly, including Ethiopic,
Canadian Syllabics, Cherokee, additional rare ideographs,
Sinhala, Syriac, Burmese, Khmer, and Braille.

A single 16-bit number is assigned to each code element 
defined by the Unicode Standard. Each of these 16-bit 
numbers is called a code value and, when referred to in text, 
is listed in hexadecimal form following the prefix "U+". For
example, the code value U+0041 is the hexadecimal number
0041 (equal to the decimal number 65). It represents the 
character "A" in the Unicode Standard. 

Each character is also assigned a unique name that 
specifies it and no other. For example, U+0041 is assigned 
the character name "LATIN CAPITAL LETTER A."
U+0A1B is assigned the character name "GURMUKHI
LETTER CHA." These Unicode names are identical to the
ISO/IEC 10646 names for the same characters. 
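
The code values and character names above can be checked with Python's standard `unicodedata` module. The helper name `describe` is an assumption made for this sketch.

```python
import unicodedata


def describe(code_value: int) -> str:
    """Format a code value as in the text, followed by its character name."""
    return f"U+{code_value:04X} {unicodedata.name(chr(code_value))}"


# Hexadecimal 0041 is decimal 65, which is the character "A".
print(int("0041", 16), chr(0x0041), describe(0x0041))
print(describe(0x0A1B))
```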

The Unicode Standard groups characters together by 
scripts in code blocks. A script is any system of related
characters. The standard retains the order of characters in a
source set where possible. When the characters of a script
are traditionally arranged in a certain order (alphabetic
order, for example) the Unicode Standard arranges them in
its code space using the same order whenever possible. Code
blocks vary greatly in size. For example, the Cyrillic code 
block does not exceed 256 code values, while the CJK code 
block has a range of thousands of code values. 

Code elements are grouped logically throughout the range 
of code values, called the "codespace." The coding starts at
U+0000 with the standard ASCII characters, and continues
with Greek, Cyrillic, Hebrew, Arabic, Indic and other
scripts, followed by symbols and punctuation. The code
space continues with Hiragana, Katakana, and Bopomofo.
The unified Han ideographs are followed by the complete set
of modern Hangul. The surrogate range of code values is
reserved for future expansion with UTF-16. Towards the end
of the codespace is a range of code values reserved for
private use, followed by a range of compatibility characters.
The compatibility characters are character variants that are
encoded only to enable transcoding to earlier standards and 
old implementations which made use of them. 



Character encoding standards define not only the identity 
of each character and its numeric value, or code position, but 
also how this value is represented in bits. The Unicode 
Standard endorses at least three forms that correspond to
ISO 10646 transformation formats, UTF-7, UTF-8 and
UTF-16.

The ISO/IEC 10646 transformation formats UTF-7, 
UTF-8 and UTF-16 are essentially ways of turning the 
encoding into the actual bits that are used in implementation. 
UTF-16 assumes 16-bit characters and allows for a certain 
range of characters to be used as an extension mechanism in 
order to access an additional million characters using 16-bit 
character pairs. The Unicode Standard, Version 2.0, Addison
Wesley Longman (1996) (with updates and additions added
via "The Unicode Standard, Version 2.1") has adopted this
transformation format as defined in ISO/IEC 10646. This 
reference is incorporated herein by reference in its entirety 
and for all purposes. 
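
The extension mechanism mentioned above is the UTF-16 surrogate pair. A minimal Python sketch of the arithmetic follows; the function name is an assumption. A code point beyond the 16-bit range is split into a high and a low 16-bit unit, each carrying 10 bits, which gives 2^20 = 1,048,576 additional code points.

```python
from typing import Tuple


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a code point above U+FFFF into two 16-bit UTF-16 units."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point does not need a surrogate pair")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)   # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)  # bottom 10 bits of the offset
    return high, low
```

For example, U+10400 maps to the pair (0xD801, 0xDC00), which matches what Python's own `utf-16-be` codec produces for that character.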

The second transformation format is known as UTF-8. 
This is a way of transforming all Unicode characters into a 
variable length encoding of bytes. It has the advantages that 
the Unicode characters corresponding to the familiar ASCII 
set end up having the same byte values as ASCII, and that 
Unicode characters transformed into UTF-8 can be used 
with much existing software without extensive software 
rewrites. The Unicode Consortium also endorses the use of 
UTF-8 as a way of implementing the Unicode Standard. Any 
Unicode character expressed in the 16-bit UTF-16 form can 
be converted to the UTF-8 form and back without loss of 
information. The Unicode Standard specifies unambiguous 
requirements for conformance in terms of the principles and 
encoding architecture it embodies. A conforming implemen- 
tation has the following characteristics, as a minimum 
requirement: 

characters are 16-bit units; 

characters are interpreted with Unicode semantics; 

unassigned codes are not used; and, 

unknown characters are not corrupted. 

UTF-8 implementations of the Unicode Standard are 
conformant as long as they treat each UTF-8 encoding of a 
Unicode character (sequence of bytes) as if it were the 
corresponding 16-bit unit and otherwise interpret characters 
according to the Unicode specification. The full conform- 
ance requirements are available within The Unicode 
Standard, Version 2.0, Addison Wesley Longman, 1996, 
previously incorporated by reference. UTF-7 is designed to
provide 7 bit characters that are useful for 7 bit media/ 
transport. Email as specified in RFC 822, for example, is a 
7 bit system. UTF-16 is designed for 16 bit media/transport 
and UTF-8 is designed for 8 bit media/transport. Most of the 
Internet is 8 bit transportable, but there are legacy systems 
using 7 bits (e.g., DNS, SMTP email, etc.). 
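
The properties of the three transformation formats discussed above can be observed with Python's built-in codecs. The helper `transcode` and the sample text are assumptions made for this sketch.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Re-encode bytes from one transformation format to another."""
    return data.decode(source).encode(target)


text = "A中"  # one ASCII letter and one CJK ideograph

utf16 = text.encode("utf-16-be")                # fixed 16-bit units here
utf8 = transcode(utf16, "utf-16-be", "utf-8")   # variable-length bytes
utf7 = transcode(utf16, "utf-16-be", "utf-7")   # 7-bit-safe bytes

# Converting UTF-16 to UTF-8 and back loses no information.
round_trip = transcode(utf8, "utf-8", "utf-16-be")
```

In this example the letter "A" keeps its single ASCII byte in UTF-8, the ideograph takes three bytes in UTF-8 and two in UTF-16, and every byte of the UTF-7 form stays below 0x80, which is what makes it usable on 7-bit transports.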

2. Terminology 

Some of the terms used herein are not commonly used in 
the art. Other terms have multiple meanings in the art. 
Therefore, the following definitions are provided as an aid
to understanding the description that follows. The invention 
as set forth in the claims should not necessarily be limited to 
these definitions. 

Linguistic encoding type — any character or glyph encod- 
ing type (e.g., ASCII or BIG5) now known or used in the 
future. 

Universal linguistic encoding type — any linguistic encoding
type, now known or developed in the future, that
encompasses more than one character or glyph set within its
encoding. Unicode is one example. BIG5, iso-8859-11, and
GB-2312 are others.
Digitally represented — the way characters are presented
as a result of encoding (e.g., in a bit stream, a hexadecimal
format, etc.)

Digital sequence — a particular sequence of ones and
zeros, hexadecimal characters, or other constituents in a
digital representation.

"Portion" of a digitally represented domain name — any
section or a whole of a domain name; e.g., the top-level
domain, the second level domain, and the top and second
level domain together.

Known digital sequence — a digital sequence of interest
because it is known to be associated with some commonly
used character combination (or other property of domain
names) encoded in a particular encoding type (e.g., the BIG5
digital sequence for ".com").

"Collection" of known digital sequences — any arrangement
of or connection between multiple known digital
sequences. Typically, though not necessarily, stored together
logically as a table (e.g., a "mapping table" described
herein).

DNS encoding type — an encoding type supported by the
DNS protocol of a network or Internet, e.g., a limited set of
ASCII specified in RFC 1035.

Non-DNS encoding type — an encoding type not supported
by the DNS protocol under consideration, e.g., BIG5
under RFC 1035.

3. Implementations of iDNS

Turning now to FIG. 1, some important components of a
network 10 used in an embodiment of this invention include
a client 12, a corresponding node 14 with whom client 12
wishes to communicate, an iDNS server 16 and a conven-

according to conventional DNS protocol and forwards the
address back to iDNS server 16. The iDNS server 16, in turn,
transmits the needed network address back to client 12,
where it is placed in the student's message. The message is
packetized, with each packet having a destination network
address corresponding to node 14. Client 12 then sends the
message packets over the Internet to node 14.

This procedure can be understood more fully by considering
the operations described in the interaction process flow
diagram of FIG. 2. As shown there, client 12 is depicted by
a vertical line on the left-hand side of the figure, iDNS server
16 is depicted by a vertical line in the center of the figure,
and DNS server 18 is depicted by a vertical line on the
right-hand side of the figure.

Initially, at 203, an application on client 12
generates a message intended for a network destination. The
domain name for that destination is input in non-DNS
compatible text encoding format. Thus, the text is encoded
in a linguistic encoding type that digitally represents the
characters of the text. As mentioned, ASCII is but one
linguistic encoding type. In preferred embodiments, the
invention handles a wide range of encoding types. Examples
of some in wide use include GB2312, BIG5, Shift-JIS,
EUC-JP, KSC5601, extended ASCII, and others.

After the client application creates the message at 203, the
client operating system creates a DNS request to resolve the
domain name at 205. The DNS request may resemble a
conventional DNS request in most regards. However, the
domain name provided in the request will be provided in a
non-DNS encoding format. The client [illegible]. Note
that the client operating system may be configured to send
DNS requests to iDNS server 16. In other words, the

tional DNS server 18 The IDNS server 16 may bsten on a DNS ^ of dient n h ^ sefver lfi 

DNS por (currently addressed to the domain name port 53) n( . iDNS 

server 16 extracts the encoded domain name 

for multilingual domain name queries in place of a normal hom the DNS request ^ generates a transformed DNS 

DNS server, which may include the Berkeley Internet Name 35 request presenting the domain name in a DNS compatible 

Domain ('BIND* and its executable version 'named') which encoding format (presently the reduced set ASCII specified 

is a widely used DNS server written by Paul Vixie (http:// in RFC 1035). See 209. The iDNS server 16 then transmits 

www.isc.org/). its DNS request to conventional DNS name server 18. See 

To understand the role of these components, assume that 211. The name server then uses a conventional DNS proto- 

client 12 is used by a Chinese student who wishes to inquire 40 col to obtain the IP address of the domain name used in the 

about employment in a Hong Kong business that operates client's communication. See 213. Then, at 215, the name 

corresponding node 14. The student has previously commu- server replies to the iDNS server with the requested IP 

nicated with the business and has obtained the domain name address. The iDNS server 16 then transmits the IP address 

of that business. The domain name is provided in native back to client 12 at 217. Finally, client 12, with IP address 

Chinese characters. Client 12 is outfitted with a keyboard 45 now in hand, sends its communication to the intended 

that can type Chinese language characters and is configured destination. See 219. 

with software that can recognize encoded Chinese characters As indicated above, the domain name must, at some point, 
and accurately display them on a computer screen. be converted from a non-DNS encoding type to a DNS 
Now, the student prepares a message to the Hong Kong compatible encoding type. In the above examples, this is 
business, encloses her resume, and types in the Chinese 50 accomplished with a proxy iDNS server. This need not be 
domain name as the destination. When she instructs client 12 the case, however, as the functionality necessary for con- 
to send the message to corresponding node 14, the system version may be embodied in the client or the conventional 
shown in FIG. 1 takes the following actions. First, the DNS server, as well. 

corresponding node domain name is submitted, in the native In alternative embodiments, the functions performed by 

language, to iDNS server 16 via a DNS request. The iDNS 55 the proxy iDNS server are implemented in whole (or in part) 

server 16 recognizes that the domain name is not in a format on the client and/or on the DNS server. In one embodiment, 

that can be handled by a conventional DNS server. Therefore operations including detecting an encoding type, translating 

it translates the Chinese domain name to a format that can a non-DNS encoded domain to a DNS encoded domain 

be used with a conventional DNS server (normally a limited name and identifying a default name server (operations 

set of the ASCII characters). The iDNS server 16 then 60 305-311 of the FIG. 3A flow chart discussed below) are 

repackages the DNS request, with the translated correspond- implemented on an Internet application (e.g., a multilingual- 

ing node domain name, and transmits that request to con- enabled Web browser). In this embodiment, code detection 

ventional DNS server 18. DNS server 18 then uses the and code conversion are automatically done prior to dis- 

normal DNS protocol to obtain a network address for the patching a DNS resolution request to a DNS server. In some 

domain name it received in the DNS request. The resulting 65 embodiments, the application can provide manually defined 

network address is the network address of corresponding linguistic encoding which obviates the need for code detec- 

node 14. DNS server 18 packages that network address tion. 
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Iq another alternative embodiment, operations 305-311 
can be implemented on the iDNS server. Other embodiments 
include collapsing all or some fraction of the operations of 
the proxy iDNS into the DNS server. For example, code for 
some iDNS functions can be collapsed into BIND code as a 5 
compilable module. 
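The proxy interaction of FIG. 2 (operations 205 through 217) can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `idns_proxy` and `hex_stand_in` are hypothetical, the conventional DNS server is modeled as a plain dictionary, and the placeholder transform is deliberately not the transformation of the Internet draft.

```python
from typing import Callable, Dict, Optional


def hex_stand_in(name: str) -> str:
    """Placeholder for operation 209 (an assumption, NOT the draft's algorithm):
    each label becomes 'x' followed by the hex of its UTF-8 bytes."""
    return ".".join("x" + label.encode("utf-8").hex() for label in name.split("."))


def idns_proxy(raw_name: bytes,
               encoding: str,
               zone: Dict[str, str],
               transform: Callable[[str], str] = hex_stand_in) -> Optional[str]:
    """Resolve a natively encoded domain name the way the proxy in FIG. 2 does.

    raw_name  -- domain name bytes as sent by the client (205)
    encoding  -- encoding type of raw_name, assumed already identified
    zone      -- stand-in for the conventional DNS server's records
    """
    unicode_name = raw_name.decode(encoding)   # native encoding -> Unicode
    ascii_name = transform(unicode_name)       # 209: DNS compatible form
    return zone.get(ascii_name)                # 211-215: conventional lookup
```

The address returned by `idns_proxy` corresponds to what the iDNS server relays to the client at 217; a real proxy would repackage and forward DNS messages rather than consult a dictionary.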

In FIG. 2, the conversion of the domain name from one 
linguistic encoding type to a second linguistic encoding type 
(compatible with DNS) is performed at 209. As shown in 
FIG. 3A, in accordance with a preferred embodiment of this 
invention, this conversion may take place via a process 301. 
The process begins at 303 with the system identifying the 
encoding type of the domain name in the DNS request. This 
is necessary when the system may be confronted with 
multiple different encoding types. After the encoding type 
has been identified, the system next determines whether the 
domain name was encoded in a DNS compatible encoding 
type at 305. Currently, that requires determining whether the 
domain name is encoded in the reduced set ASCII encoding 
type. If so, further conversion is unnecessary and process 
control is directed to 311, which will be described below. 
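The test at 305 can be sketched as below. This is a minimal illustration, assuming that "reduced set ASCII" refers to the letters-digits-hyphen label syntax and the 63-octet label limit of RFC 1035; the helper name `is_dns_compatible` is hypothetical.

```python
import re

# One label in the preferred name syntax of RFC 1035: starts with a letter,
# ends with a letter or digit, letters/digits/hyphens inside, at most 63 octets.
_LABEL = re.compile(rb"[A-Za-z](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_dns_compatible(name: bytes) -> bool:
    """Return True if every label of `name` is already reduced-set ASCII,
    in which case no conversion is needed before name server selection."""
    labels = name.rstrip(b".").split(b".")
    return all(_LABEL.fullmatch(label) is not None for label in labels)
```

A name that fails this check is treated as the "interesting case" and sent on to translation at 307.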

In the interesting case, the domain name is encoded in a 
non-DNS format. When this occurs, process control is 
directed to 307 where the system translates the domain name 
to a universal encoding type. In a preferred embodiment, this 
universal encoding type is Unicode. In this case, the 
characters identified in the native encoding type are identified in 
the Unicode standard and converted to the Unicode digital 
sequences for those characters. 

The newly translated domain name is then further 
transformed from the universal encoding type to a DNS 
compatible encoding type. See 309. Thus, this final encoding type 
may be reduced set ASCII. Note that the translation from the 
DNS incompatible format to the DNS compatible format 
takes place in two steps through an intermediate universal 
encoding type. This two step procedure will be detailed 
below. It should be understood, however, that it may be 
possible to directly convert, in one step, the DNS 
incompatible domain name to the DNS compatible domain name. 
This may be accomplished in a system having multiple 
conversion algorithms, each designed to convert a specific 
encoding type to ASCII (or some other future 
DNS-compatible encoding type). In one example, these 
algorithms may be modeled after the "Durst algorithm" 
described above. Many other suitable algorithms are known 
or can be developed with routine effort. 
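The two-step conversion (307, then 309) can be sketched as follows. This is an illustration under explicit assumptions rather than the draft's reference algorithm: the function names are hypothetical; the mapping of the initial nibble of each data entity to the letters G through V, with the remaining nibbles written as hexadecimal digits, is inferred from the sample record U4B8O7E7RBB4U7BDP1.U696R0E5OAA0U59DQ1 given later in this description; and that sample is reproduced only when the Unicode text is serialized as UTF-8 and split into 2-byte entities, which is also an inference.

```python
def nibble_encode(data: bytes, entity_size: int = 2) -> str:
    """Encode `data` entity by entity: the initial nibble of each entity maps
    to 'G'..'V' (0 -> G, 15 -> V); subsequent nibbles stay as 0-9/A-F.
    The mapping is an assumption inferred from the sample record."""
    out = []
    for i in range(0, len(data), entity_size):
        digits = data[i:i + entity_size].hex().upper()
        initial = chr(ord("G") + int(digits[0], 16))
        out.append(initial + digits[1:])
    return "".join(out)


def to_dns_name(raw: bytes, source_encoding: str,
                unicode_serialization: str = "utf-8") -> str:
    """Two-step conversion of a natively encoded domain name."""
    # Step 307: native linguistic encoding -> Unicode text.
    text = raw.decode(source_encoding)
    # Step 309: Unicode -> letters and digits only, one label at a time.
    return ".".join(
        nibble_encode(label.encode(unicode_serialization))
        for label in text.split(".")
    )
```

With these assumptions, a GB2312-encoded name consisting of the characters U+4E07 U+7EF4 U+7F51, a dot, and U+65B0 U+52A0 U+5761 converts to the sample record name quoted above. Passing `unicode_serialization="utf-16-be"` would instead treat each UCS-2 code unit as one 2-byte entity.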

With a DNS compatible domain name now in hand, the 
system need only determine which conventional DNS name 
server it should forward the domain name to. According to 
normal DNS protocol, the DNS request might be forwarded 
to a top-level name server. As will be described in more 
detail below, it may be convenient to have different root 
name servers handle different linguistic domains. For 
example, the Chinese government may maintain a root name 
server for Chinese language domain names, the Japanese 
government or a Japanese corporation may maintain a root 
name server for Japanese language domain names, the 
Indian government may maintain a root name server for 
Hindi language domain names, etc. In any event, the system 
must identify the appropriate name server at 311 as indicated 
in FIG. 3A. After this has been accomplished, the conversion 
process is complete and the DNS request can be transmitted 
to the DNS system for handling according to convention. 
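Name server selection at 311 can be pictured as a simple lookup keyed by the identified encoding type. The sketch below is hypothetical throughout: the table contents, the function name, and the addresses (taken from the 192.0.2.0/24 documentation range) are placeholders, not values from this description.

```python
# Hypothetical table: encoding identifier -> root name server for that
# linguistic domain. All addresses are documentation placeholders.
ROOT_SERVERS = {
    "big5": "192.0.2.10",       # Chinese language domain names
    "gb2312": "192.0.2.10",
    "shift_jis": "192.0.2.20",  # Japanese language domain names
    "euc_jp": "192.0.2.20",
}
DEFAULT_ROOT = "192.0.2.1"      # conventional root for plain ASCII names


def select_name_server(encoding: str) -> str:
    """Return the name server to which the converted request is forwarded."""
    return ROOT_SERVERS.get(encoding.lower(), DEFAULT_ROOT)
```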

Preferably, the process depicted in FIG. 3A is performed 
solely on an iDNS server. However, some of the process 
may be performed on a client or a conventional DNS server. 
For example, 303 and 305 could be performed on a client 
and 309 could be performed on a conventional DNS server. 

A preferred division of labor for the iDNS function (327) 
is depicted in FIG. 3B. As shown there, an iDNS mapper 
server 321 performs operations 305-311. To this end, it 
includes a mapping table (an example of which is described 
below with reference to FIG. 5) and can convert all linguistic 
encoding types to Unicode (or other suitable universal 
encoding type). In this embodiment, a client 325 performs 
operation 303 and a conventional DNS server 323 performs 
the standard DNS resolving protocol. 

In one implementation iDNS mapper server 321 runs on 
a machine (identified by i2.i-dns.com for example) on a 
designated port (e.g., a port number 2000). It accepts a 
whole portion of a digitally represented domain name in any 
linguistic encoding type and returns a whole portion of a 
digitally represented domain name in Unicode transformed 
to a DNS encoding type (UTF-5). Note that the mapping 
table and the conversion program code may be quite large, 
thereby increasing the size of DNS server 323 several fold 
(if implemented there). By separating operations 305-311 
from the DNS protocol and running it separately, the amount 
of code needed to distribute iDNS is reduced. 

As indicated in the discussion of FIG. 3A, when the 
system must handle multiple encoding types, it must be 
capable of distinguishing one encoding type from the next. 
This process was depicted at block 303 and is elaborated on 
in FIG. 4. 

As shown in FIG. 4, the process of identifying an encod- 
ing type 401 begins at 403 with the system identifying the 
digital sequence of the top-level domain of the domain 
name. In the system in place in March 1999, the top-level 
domains included .com, .edu, .gov, .mil, .org, .int, .net, and 
the various two letter country designations (e.g., .fr, .sg, .kr, 
etc.). 

After the digital sequence of the top-level domain has 
been identified, the system next matches that sequence to a 
particular encoding type. In a preferred embodiment, this 
involves matching the sequence against records in a map- 
ping table at 405. An exemplary mapping table will be 
described in more detail below. For now, simply recognize 
that the table (or other logical structure) includes a list of 
digital sequences for various top-level domains in the vari- 
ous linguistic encoding types handled by the system. Each 
separate record also includes an associated encoding type 
identifier. The system matches the digital sequence under 
consideration by simply comparing it against the sequences 
in the various records of the mapping table (using a standard 
database look up procedure such as a binary search, hash 
table, B-tree, etc.). This will typically provide a single 
match. However, if multiple entities are responsible for 
issuing top-level domains (each responsible for a different 
language, for example), then it is possible that the digital 
sequences for two top-level domains in different encoding 
formats could be identical. 
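The lookup at 403 through 407 can be sketched as below. This is a minimal illustration with assumed names: the table is built at run time from one illustrative top-level label (the two characters U+516C U+53F8, which are not taken from this description), and a plain dictionary stands in for the binary search, hash table, or B-tree mentioned above.

```python
from typing import Dict, List

# Illustrative multilingual top-level label (an assumption for this sketch).
TLD_TEXT = "\u516c\u53f8"


def build_table(tld_text: str, encodings: List[str]) -> Dict[bytes, List[str]]:
    """Map each known digital sequence to the encoding type(s) it may indicate."""
    table: Dict[bytes, List[str]] = {}
    for enc in encodings:
        table.setdefault(tld_text.encode(enc), []).append(enc)
    table.setdefault(b"com", []).append("ascii")
    return table


MAPPING_TABLE = build_table(TLD_TEXT, ["big5", "gb2312"])


def top_level_sequence(name: bytes) -> bytes:
    """403: digital sequence of the top-level domain (last label)."""
    return name.rstrip(b".").rsplit(b".", 1)[-1]


def candidate_encodings(name: bytes) -> List[str]:
    """405/407: encoding types whose known sequence matches the top-level
    domain. More than one entry means the ambiguity must still be resolved."""
    return MAPPING_TABLE.get(top_level_sequence(name), [])
```

A single-element result corresponds to the unambiguous case; an empty list means the top-level sequence is not in the table.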

To address this possibility, the system determines, at 407,
whether multiple records match the digital sequence under
consideration. If not, the process is complete at 413 with the
system deciding to use the encoding identified in the single
matching record. If, on the other hand, two or more records
match, the system must resolve this ambiguity. It does this
by first identifying a lower-level domain (e.g., a subdomain
such as a second level domain) digital sequence. See 409. In
other words, the domain name under consideration will have
a digital sequence associated with its lower level domains.
The now expanded digital sequence is again matched against
the digital sequences in the mapping table (405). Note that
some records of the table may include digital sequences for
the combination of top-level and lower level domains (to
resolve a potential ambiguity in the sequences of the top-level
domains). After a match is found at 405, the process
proceeds through 407 as described above.

In an alternative embodiment, only the digital sequences
for top-level domains are maintained in the mapping table.
No provision is made for extended sequences to resolve
ambiguities. In this case, when 407 is answered in the
affirmative (multiple records do match), the system identifies
each of the potential matches (candidate encoding
types). The sequence under consideration is then decoded
using each of the potential encoding types. For example, the
root domain digital sequence may have found a match for
.net in one of the Japanese encoding types and .com in one
of the Chinese encoding types.

One of the decoded strings should be understandable in
the language of the candidate encoding type. The other(s)
should be gibberish. Thus, the system selects the candidate
encoding type providing the best decoding of the secondary
domain. The process is then concluded at 413 with the
system using the selected encoding type.

As indicated at 405 in the discussion of FIG. 4, the iDNS
server may match a digital sequence for a top-level domain
of a domain name query against known digital sequences for
multiple encoding types. A mapping table may house the
known digital sequences. FIG. 5 provides a mapping table
501 in accordance with one embodiment of this invention.
Each record in table 501 specifies a minimum code resolving
string (e.g., a top-level domain) for a particular encoding
type (e.g., .com for BIG5).

As shown, mapping table 501 includes six separate fields.
The first of these is a time to live that specifies how long
before the entry cache expires. Next, a minimum code
resolving string field identifies the digital sequence of a
portion of a domain name (e.g., the digital encoding for .com
in BIG5). Note that the minimum code resolving string is
typically provided as an 8 bit binary string. To simplify entry
and maintenance of minimum code resolving strings in table
501, a transformation may be applied to the binary string in
order to get the form shown.

While the minimum code resolving string may often be
the top-level domain, this need not be the case. For some
linguistic encodings, it may be necessary to include the
second or a higher level domain to uniquely resolve the type
of encoding given in the string because of an ambiguity.
Similarly, it may not always be necessary to use the whole
top-level domain to uniquely determine the encoding type.
This speeds the search for a match.

The "authority" specified in the table is the entity given
authority over domain names specified in the record. This
authority can register sub-domains under its authority. For
example, if an "i-dns" entity is given authority over .com in
BIG5, it may have authority to issue all sub-domain names
under .com in BIG5. This ensures that only unique domain
names are assigned. Also, the authority denotes an entity
having dominion over a name server (or servers) with
"authoritative" records that provide IP addresses for domain
names in the authority's portion of DNS space. The "encoding"
field of table 501 specifies the encoding type of the
domain name matching the record. The "transform" field
specifies the final encoding of the domain name. For
example, UTF-5 is the Durst algorithm applied to Unicode
(described below). Finally, a "comments" field contains a
text string identifying what portion of a domain name
corresponds to the minimum code resolving string.

FIG. 6 illustrates an exemplary domain name tree for
resolving Chinese language domain names. An iDNS server
detecting a Chinese language encoding type will be configured
with default name servers for resolving a domain
name. As shown in FIG. 6, under the root there are multiple
top-level domains (e.g., .com, .edu, .sg, etc.). Under the .sg
top-level domain, there are multiple Chinese language
second-level domains such as edu.sg, and under that, there
are multiple domains including nus.edu.sg, and so on. Similarly,
under the top-level .com, there are multiple second-level
Chinese language sub-domains such as email.com.

As noted in the discussion of the embodiment of FIG. 3A,
the iDNS system converts the universal encoding type (e.g.,
Unicode) of the domain name to a DNS encoding type. In
one preferred embodiment, this is accomplished using a
transformation algorithm defined by the Internet draft,
"Internationalization of Domain Names", by Martin Durst,
previously incorporated by reference. The algorithm will
transform a variable length data entity to a form that consists
of only the RFC-compliant ASCII monocase alphabets and
numbers. The table below shows the transformation table
used in the Internet draft.

[Transformation table not reproduced.]

The first two columns of the table are to be interpreted as
binary (or hexadecimal) values while the last two columns
are to be interpreted as the ASCII RFC1035-compliant
characters. 'Initial' and 'subsequent' mean the initial nibble
(half a byte) of the data entity and the rest of the data entity
respectively. If the data entity is 2 bytes long (as in the case
of UCS-2), then there will be 4 nibbles in that particular data
entity.

As indicated in the above discussion, to resolve a multilingual
domain name, a client application will submit the
multilingual non-RFC-compliant query to an iDNS proxy
server. This proxy server will then transform the query to an
RFC-compliant format using this transformation algorithm
and submit this query to a DNS server.

At the DNS server, there will be an entry for this
RFC-compliant query that maps to a valid address such as:

U4B8O7E7RBB4U7BDP1.U696R0E5OAA0U59DQ1 IN A 12.34.56.78

The DNS server will then return this IP address in
accordance with RFC 1035 to the iDNS proxy server. The
proxy will then relay the message containing the correctly
resolved IP address to the client. Note that the transformed
domain name (in ASCII) normally will have to be registered
with the authority responsible for controlling and issuing
conventional DNS domain names.

Embodiments of the present invention relate to an apparatus
for performing the above-described iDNS operations.
This apparatus may be specially constructed (designed) for
the required purposes, or it may be a general-purpose
computer selectively activated or reconfigured by a computer
program stored in the computer. The processes presented
herein are not inherently related to any particular
computer or other apparatus. In particular, various
general-purpose machines may be used with programs written in
accordance with the teachings herein, or it may be more
convenient to construct a more specialized apparatus to
perform the required method steps. The required structure
for a variety of these machines will appear from the
description given above.

In addition, embodiments of the present invention further 
relate to computer readable media that include program 
instructions for performing various computer-implemented 
operations. The media may also include, alone or in 
combination with the program instructions, data files, data 
structures, tables, and the like. The media and program 
instructions may be those specially designed and constructed 
for the purposes of the present invention, or they may be of 
the kind well known and available to those having skill in 
the computer software arts. Examples of computer-readable 
media include magnetic media such as hard disks, floppy 
disks, and magnetic tape; optical media such as CD-ROM 
disks; magneto-optical media such as floptical disks; and 
hardware devices that are specially configured to store and 
perform program instructions, such as read-only memory 
devices (ROM) and random access memory (RAM). The 
media may also be a transmission medium such as optical or 
metallic lines, wave guides, etc. including a carrier wave 
transmitting signals specifying the program instructions, 
data structures, etc. Examples of program instructions 
include both machine code, such as produced by a compiler, 
and files containing higher level code that may be executed 
by the computer using an interpreter. 

FIG. 7 illustrates a typical computer system in accordance 
with an embodiment of the present invention. The computer 
system 700 includes any number of processors 702 (also 
referred to as central processing units, or CPUs) that are 
coupled to storage devices including primary storage 706 
(typically a random access memory, or "RAM") and primary 
storage 704 (typically a read only memory, or "ROM"). As 
is well known in the art, primary storage 704 acts to transfer 
data and instructions uni-directionally to the CPU and 
primary storage 706 is used typically to transfer data and 
instructions in a bi-directional manner. Both of these 
primary storage devices may include any suitable type of the 
computer-readable media described above. A mass storage 
device 708 is also coupled bi-directionally to CPU 702 and 
provides additional data storage capacity and may include 
any of the computer-readable media described above. The 
mass storage device 708 may be used to store programs, data 
and the like and is typically a secondary storage medium 
such as a hard disk that is slower than primary storage. It will 
be appreciated that the information retained within the mass 
storage device 708 may, in appropriate cases, be 
incorporated in standard fashion as part of primary storage 706 as 
virtual memory. A specific mass storage device such as a 
CD-ROM 714 may also pass data uni-directionally to the 
CPU. 

CPU 702 is also coupled to an interface 710 that includes 
one or more input/output devices such as video 
monitors, track balls, mice, keyboards, microphones, 
touch-sensitive displays, transducer card readers, magnetic or 
paper tape readers, tablets, styluses, voice or handwriting 
recognizers, or other well-known input devices such as, of 
course, other computers. Finally, CPU 702 optionally may 
be coupled to a computer or telecommunications network 
using a network connection as shown generally at 712. With 
such a network connection, it is contemplated that the CPU 
might receive information from the network, or might output 
information to the network in the course of performing the 
above-described method steps. The above-described devices 
and materials will be familiar to those of skill in the 
computer hardware and software arts. 

The hardware elements described above may be 
configured (usually temporarily) to act as one or more software 
modules for performing the operations of this invention. For 
example, instructions for detecting an encoding type, 
transforming that encoding type, and identifying a default name 
server may be stored on mass storage device 708 or 714 and 
executed on CPU 702 in conjunction with primary memory 
706. 

Although the foregoing invention has been described in 
some detail for purposes of clarity of understanding, it will 
be apparent that certain changes and modifications may be 
practiced within the scope of the appended claims. 

What is claimed is: 

1. A method, implemented on an apparatus, of detecting 
the linguistic encoding type of a digitally represented 
domain name, the method comprising: 

receiving the digital sequence of a prespecified portion of 
the digitally represented domain name; 

matching said digital sequence from the domain name 
with a known digital sequence from a collection of 
known digital sequences, each associated with a par- 
ticular linguistic encoding type, and the collection 
including known digital sequences for at least two 
different linguistic encoding types; and 

identifying an encoding type associated with the known 
digital sequence matching the digital sequence from the 
domain name. 

2. The method of claim 1, further comprising receiving a 
DNS request containing the digitally represented domain 
name. 

3. The method of claim 1, wherein the prespecified 
portion of the digitally represented domain name is a mini- 
mum code resolving string in the domain name. 

4. The method of claim 1, further comprising transform- 
ing the format of the digital sequence of the digitally 
represented domain name prior to matching that digital 
sequence. 

5. The method of claim 1, wherein the collection of 
known digital sequences is provided in a table containing 
records having attributes including known digital sequences 
and encoding types. 

6. The method of claim 5, wherein the table includes 
records having at least the following encoding types: ASCII, 
BIG5, GB2312, shift-JIS, EUC-JP, KSC5601, and extended 
ASCII. 

7. The method of claim 5, wherein identifying the encod- 
ing type comprises identifying the encoding type of a record 
having the matching known digital sequence. 

8. The method of claim 1, wherein at least two known 
digital sequences match the digital sequence from the 
domain name, and further comprising: 

receiving the digital sequence of a second portion of the 
digitally represented domain name; and 

matching the digital sequence of the second portion with 
a known digital sequence from the collection of known 
digital sequences. 

9. The method of claim 2, further comprising: 

identifying a root level DNS server responsible for 
resolving root level domains of the identified encoding type; 
and 

transmitting the DNS request to the root level DNS server. 

10. The method of claim 9, further comprising, prior to 
transmitting the DNS request, converting the domain name's 
digital sequence from the identified encoding type to a DNS 
encoding type compatible with DNS protocol. 

11. The method of claim 10, wherein the DNS encoding 
type is ASCII or a universal linguistic encoding type. 

12. The method of claim 10, wherein converting the 
domain name's digital sequence comprises: 

converting the domain name's digital sequence from the 
identified encoding type to a universal linguistic 
encoding type; and 

converting the domain name's digital sequence from the 
universal linguistic encoding type to a DNS encoding 
type compatible with the DNS protocol. 

13. A computer program product comprising a machine 
readable medium on which is provided program instructions 
for performing a method of detecting the linguistic encoding 
type of a digitally represented domain name, the method 
comprising: 

receiving the digital sequence of a prespecified portion of 
the digitally represented domain name; 

matching said digital sequence from the domain name 
with a known digital sequence from a collection of 
known digital sequences, each associated with a 
particular linguistic encoding type, and the collection 
including known digital sequences for at least two 
different linguistic encoding types; and 

identifying an encoding type associated with the known 
digital sequence matching the digital sequence from the 
domain name. 

14. The computer program product of claim 13, wherein 
the collection of known digital sequences is provided in a 
table containing records having attributes including known 
digital sequences and encoding types. 

15. The computer program product of claim 13, further 
comprising program instructions for the following: 

receiving a DNS request containing the digitally repre- 
sented domain name; 

identifying a root level DNS server responsible for resolv- 
ing root level domains of the identified encoding type; 
and 

transmitting the DNS request to the root level DNS server. 

16. The computer program product of claim 15, further 
comprising program instructions for the following: 

prior to transmitting the DNS request, converting the 
domain name's digital sequence from the identified 
encoding type to a DNS encoding type compatible with 
DNS protocol. 